Skip to content

feat: add RL checkpoint format backward compat integration test#2776

Open
samsja wants to merge 2 commits into
mainfrom
feat/ckpt-compat-test
Open

feat: add RL checkpoint format backward compat integration test#2776
samsja wants to merge 2 commits into
mainfrom
feat/ckpt-compat-test

Conversation

@samsja

@samsja samsja commented Jun 11, 2026

Copy link
Copy Markdown
Member

Generates a checkpoint with a pinned version of prime-rl (via git worktree) then verifies the current version can resume RL training from it.

Set CKPT_FORMAT_REF to a git tag/commit to enable cross-version testing. Defaults to current HEAD (same-version round-trip).


Note

Low Risk
Test and CI config only; no changes to training, checkpoint serialization, or runtime behavior in production code.

Overview
Adds a GPU integration test that exercises RL checkpoint write → resume so breaking checkpoint-format changes are caught in CI.

The flow runs start.toml (saves at step 3) then resume.toml (resume_step = 3) on the same output_dir, with the usual no-error and reward checks on both runs. Optional cross-version coverage: set CKPT_FORMAT_REF to a tag/commit and the generator run uses a detached git worktree at that ref (configs copied in); resume always runs from the current checkout.

gpu_tests.yaml adds a matrix job for tests/integration/test_ckpt_compat.py on the vm runner.

Reviewed by Cursor Bugbot for commit acf930d. Bugbot is set up for automated code reviews on this repo. Configure here.

Generates a checkpoint with a pinned version of prime-rl (via git worktree)
then verifies the current version can resume RL training from it.

Set CKPT_FORMAT_REF to a git tag/commit to enable cross-version testing.
Defaults to current HEAD (same-version round-trip).
@samsja samsja marked this pull request as ready for review June 11, 2026 21:28
src_configs = repo_dir / "configs" / "ci" / "integration" / "ckpt_compat"
dst_configs = worktree / "configs" / "ci" / "integration" / "ckpt_compat"
dst_configs.parent.mkdir(parents=True, exist_ok=True)
subprocess.check_call(["cp", "-r", str(src_configs), str(dst_configs)])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config copy nests existing directory

Medium Severity

When CKPT_FORMAT_REF points at a commit that already contains configs/ci/integration/ckpt_compat, cp -r into an existing destination creates a nested ckpt_compat/ckpt_compat tree. The RL run then uses the checked-out configs, not the copies from the current branch the test intends to inject.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 55beff1. Configure here.

process.wait(timeout=30)
except subprocess.TimeoutExpired:
process.kill()
process.wait()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout skips child process cleanup

Medium Severity

On subprocess timeout, _run_rl calls terminate/kill on the top-level uv process only. Other integration tests use cleanup_process, which recursively signals torchrun, inference, and other descendants started under uv run rl.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 55beff1. Configure here.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit acf930d. Configure here.

dst_configs.parent.mkdir(parents=True, exist_ok=True)
subprocess.check_call(["cp", "-r", str(src_configs), str(dst_configs)])

yield worktree

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worktree missing submodule checkout

Medium Severity

When CKPT_FORMAT_REF points at an older commit, the test runs uv run rl from a new git worktree but never initializes deps/* submodules there. pyproject.toml depends on those paths, so checkpoint generation in the pinned version fails even though CI already initialized submodules only in the main checkout.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit acf930d. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant